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1 '■ ABSTRACT 

' ^ , Aims. The separation of foreground contamination from cosmic microwave background (CMB) observations is one of the 
^ ■ most challenging and important problem of digital signal processing in Cosmology. In literature, various techniques have 
^sj ' been presented, but no general consensus about their real performances and properties has been reached. This is due to the 
, characteristics of these techniques that have been studied essentially through numerical simulations based on semi-empirical 
fl") ■ models of the CMB and the Galactic foregrounds. Such models often have different level of sophistication and/or are based 
on different physical assumptions (e.g., the number of the Galactic components and the level of the noise). Hence, a reliable 
«^ . comparison is difficult. What actually is missing is a statistical analysis of the properties of the proposed methodologies. Here, 
we consider the Internal Linear Combination method (ILC) which, among the separation techniques, requires the smallest 
number of a priori assumptions. This feature is of particular interest in the context of the CMB polarization measurements at 
• • ■ small angular scales where the lack of knowledge of the polarized backgrounds represents a serious limit. 

Methods. The statistical characteristics of ILC are examined through an analytical approach and the basic conditions are fixed 
in a way to work satisfactorily. A comparison with the FastICA implementation of the Independent Component Analysis (ICA) 
method, one of the most celebrated techniques for blind signal separation, is made. 
Cd Results. ILC provides satisfactory results only under rather restrictive conditions. This is a critical fact to take into consideration 
in planning the future ground-based observations (e.g., with ALMA) where, contrary to the satellite experiments, there is the 
possibility to have a certain control of the experimental conditions. 
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1. INTRODUCTION 

The experimental progresses in the detection of cosmo- 
logical and astrophysical emissions require a parallel de- 
velopment of data analysis techniques in order to extract 
the maximum physical information from data. An exam- 
ple of very interest is represented by signals that are the 
mixture of the emission of distinct physical mechanisms. 
The study of the underlying physical processes needs the 
separation of the different components that contribute 
to the observed signals. This is the case of the Cosmic 
Microwave Background (CMB) observations where there 
is the necessity to separate the CMB from diffuse fore- 
grounds originated by our own Galaxy. In this context, an 



extensive literature is available a nd many approaches have 



been proposed (for a review, see iDelabrouille &; Cardoso 



2007). However, no general consensus about their real per- 



formances and properties has been reached. This is due to 
the approach followed to determine the characteristics of 
these techniques that is based on numerical simulations 
that make use of semi-empirical models of the CMB and 
the Galactic foregrounds. The point is that such models of- 
ten have different level of sophistication and/or are based 
on different physical assumptions (e.g., the number of the 
Galactic components and the level of the noise). Hence, 
a reliable comparison is difficult. A trustworthy assess- 
ment of the real capabilities of such methodologies should 
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require a rigorous analysis of their theoretical statistical 
characteristics independently of the specific context where 
they are applied. However, at least in our knowledge, in 
literature not hing has never be en presented in this sense 
(however, see lSaha et. al 112007 for a discussion concerning 
the power spectrum). Things become even more serious 
in the case of CMB polarization measurements, where the 
available a priori information is quite limite d and the use 



of bl ind separation techniques obligatory (jStivoli et al. 
20061) . 

A situation as this one is clearly unsatisfactory. 
This also because in a near future some innovative 
ground-based experiments are planned for polarization 
observations at extraordinary high spatial resolution as 
with the Atacama Large submmillimctrc/millimetre ar- 
ray (ALMA). One important advantage of these experi- 
ments is that, contrary to the satellite observations, they 
will allow a certain control of the experimental conditions. 
Hence, a fully exploitation of the capabilities of the in- 
struments implies a careful preparation of the observa- 
tions in such a way the obtained data are of sufficient 
quality to permit an effective application of the chosen 
separation methodology. For this reason, in this work we 
start exploring the capabilities of algorithms aimed at a 
careful subtraction of the foreground sources when the 
amount of available a priori information is limited (an 
expected situation for polarization observations). In par- 
ticular, we consider one of the most used approaches to 
the separation of different emissions, the internal linear 
combination method (ILC), which was adopted for in- 
stance in the reduction of the data from the Wilkinson 
Microwave A nisotropy Probe (WM AP) satellite for CMB 
observations ( Bennett et al. Il2003l ). Among the separation 
techniques, ILC requires the smallest number of a priori 
assumptions. For the reason presented above, we do not 
perform an application to any astrophysical dataset here, 
but rather we study the general properties of this tech- 
nique in order to fix the conditions under which it can be 
expected to produce reliable results. We also make a crit- 
ical comparison with the architecture and capabilities of 
the Fastlca implementation of the ICA approach, which 
is the other main handle to t he separation when little a 



priori information is available ( Stivoli et al. 2006f) 



In the following, the available data are assumed in form 
of N Q maps, taken at different frequencies, containing N p 
pixels each. More precisely, if (p) provides the value of 
the pth pixel for a map obtained at channel " i " 0, our 
starting model is: 



s^(p)^e^(p) + s^( P )+M^( P ) 



(i) 



where ©^(p), S f W (p) and M (l \p) are the contributions 
due to the CMB, the diffuse Galactic foreground and the 
experimental noise, respectively. Although not necessary 



1 In the present work, p indexes pixels in the classic spatial 
domain. However, the same formalism applies if other domains 
are considered as, for example, the Fourier one. 



for later arguments, it is assumed that all of these contri- 
butions are representable by means of stationary random 
fields. At least locally, in many experimental situations 
this is an acceptable approximation. If not, in any case it 
is often made since it permits a statistical treatment of the 
problem of interest and the results can be used as bench- 
mark in the analysis of more complex scenarios. In the 
present context, this assumption holds on small patches 
of the sky. Finally, without loss of generality, for easiness 
of notations the random fields are supposed the realiza- 
tion of zero-mean spatial processes. In the present work 
the contribution of non-diffuse components (e.g., due to 
SZ cluster, point-sources, . . . ) is not considered and it is 
supposed to be already removed through other method- 
ologies. 

The paper is organized as follows: in Sec.[2]the relevant 
analytical formulas are introduced. In Sec.[3]the statistical 
framework of ILC in noiseless observations is presented 
and compared with that derived for FastlCA. Simulations 
are also presented. In Sec. 0] the case of noisy observations 
is considered. Finally, the conclusions are drawn in Sec. [3 

2. FORMALIZATION OF THE PROBLEM 

The idea behind ILC is rather simple. The main assump- 
tion is that model |T]) can be written as 



s^(p) = e c (p) + s^(p)+Af^( P ), 



(2) 



i.e. the template of the CMB component is independent of 
the observing channel. A natural idea to exploit this fact is 
to average N a images {S^(p)}^f 1 giving a specific weight 
Wi to each of them in such a way to minimize the i mpact 
of the foregrounds and noise (jBennett et al. Il2003h . This 
means to look for a solution of type 



© c (p)=5>;S»(p). 



(3) 



In fact, if the constraint 5Zj=i u>j = 1 is imposed, Eq. 
becomes 



N„ 



S c (p) = 6 c (p) +5>*Pf '(P) +W«(p)]. (4) 

i=l 

Now, from this equation it is clear that, for a given pixel 
"p", the only variable terms are in the summatory. Hence, 
under the assumption of independence of 6 c {p) from 
Si (p) and J\f( l \p), the weights {w^ have to minimize 
the variance of & c (p), i.e. 

{wi} = argmin 



VAR [© c (p)] + VAR 



N„ 



, (5) 



where VAR[s(p)] is the expected variance of s(p). 
If S"' denotes a row vector such as S^' = 
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[S«(l), S w (2), ■ ■ ■ , S' (i) (^p)] an d the iV x N p matrix S 
is defined as 

/ S« \ 

S (2) ^ 



(G) 



then Eq. |[5J) becomes 



(7) 



In this case, the weights are given by (jEriksen et al. 2004) 

(8) 



w = 



Cs 1 ! 



where Cs is the N a x iV D cross-covariance matrix of the 
random processes that generate S, i.e. 



E[SS 3 



(9) 



and 1 = (1,1,..., 1) T is a column vector of all ones. Here, 
E[.] denotes the expectation operator. Hence, the ILC es- 
timator takes the form 



© c = w 1 S, 

= al T C^S, 

with l T w — 1 and the scalar quantity a given by 

a=[l T C- s H]-\ 



(10) 
(11) 

(12) 



In practical applications, matrix Cs is unknown and 
has to be estimated from the data. Typically, this is done 
by means of the estimator 



(13) 



In this case, the ILC estimator is given by Eqs. (fTTj| - (fT2"f 
with C s and w replaced, respectively, by Cs and 



C s 1 

i T c«Y 



(14) 



Here, it is important to underline that if the observed 
maps are not zero-mean, they have to be centered before 
the computation of Cs- After that, the resulting weights 
can be applied directly to the original (i.e., non-centered) 
S. The computation of Cs is the only point where the 
fact of working with non-zero mean maps has to be taken 
into account. 

Although in the literature it might appear that the es- 
timator pip has optimal properties, actually this is not 
true. The point is that in Eq. ^ the term Sf(p) + 
■A/^j (p) is considered as a single noise com ponent (e.g., see 
lEriksen et aT1l2004lHmshaw et al~ll2007t) . In this way the 
problem is apparently simplified since it is reduced to the 
separation of two components only. No a priori informa- 
tion on this "global" noise is required. However, this ap- 
proach can lead to wrong conclusions. For example, since 



all the components in the mixtures S are assumed to be 
zero-mean, from Eq. |4]) one could conclude that 

E[e c |@ c ] = & c +w T V[S} = © c , (15) 

i.e. the ILC estimator is unbiased @. This is not correct: 
the claim that © c is unbiased requires to prove that 

E[© c |e c , S { ] = © c + w T S f + w T E[M] = © c . (16) 

The reason is that Sf is a fixed realization of a random 
process. There is no possibility to get another one. Even if 
observed many times (under the same experimental con- 
ditions) the Galactic components will appear always the 
same. Only the noise component A/" will change. This has 
important consequences. In order to discuss this issue, it 
is useful to start with the case of noiseless observations 
that can be thought to reproduce a situation of very high 
signal-to-noise ratio. In this case, model ([2]) becomes 



S {l] = © c + S 



(i) 



(17) 



3. STATISTICAL CHARACTERISTICS OF ILC: 
NOISELESS OBSERVATIONS 

A common assumption in CMB observations is that is 
given by the linear mixture of the contribution of N c phys- 
ical processes (e.g., free-free, dust re-radiation, 
...) 

N c 

(18) 



^ — ^ a,ij &j , 



with dij constant coefficients. In practice, it is assumed 
that for the jth physical process a template &j exists 
independent of the specific channel " i " . Although rather 
strong, actually it is not unrealistic to assume that this 
condition is satisfied when small enough patches of the 
sky are considered. Inserting Eq. (fT8|) into Eq. (fT7|) one 
obtains 

S = A&, (19) 



with 



© = 



/©c 

©1 

©2 



(20) 



\&Nc/ 



and 



A = 



\1 



an ai2 



aiN c \ 



0-21 0,22 



a 2N c 



(21) 



diV l ajy o 2 . . . aj^ o N c j 



Here, matrix A, assumed to be of full rank, is shown par- 
titioned in a way that will be useful for later calculations. 

In the following, three cases are considered that corre- 
spond to possible experimental situations. 



2 The expression E[a|6] indicates conditional expectation of a 
given b. 
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3.1. Case N = N c + 1 

If N = N c + 1, i.e. when number of observations is equal 
to the number of the components (CMB included), then 
A is a square iV D x N Q matrix. In this case, 



AC & A 



(22) 



with Ce the N Q x N cross-covariance matrix of the ran- 
dom processes that generate the templates ©, i.e. 

c 6 = E[ee T ]. (23) 

If this equation is inserted in Eq. (|llj) , one obtains 

© c = al T A- T C^}&, (24) 
with the scalar a given by 

a=[l T A- T C & 1 A- 1 l]-\ (25) 
' ' — ; (A _1 ) T . Now, since it is trivially verified that 



and A 



with 



it is 



ei 



e\A T 



= (1,0,...,0) T 
1 T A- T - ~ T 



(26) 
(27) 



(28) 

Hence, a = {C^)n = (E[© c ©jT ]) _1 = er" 1 is the inverse 
of the expected variance of the CMB template. As a conse- 
quence, if the random process that generates the template 
© c is uncorrelated with those that generate the templates 
{&j} (in general, this is a reasonable assumption), i.e. if 



'6 



Ccc 





. 


. 





Oil 


012 ■ 









a N 2 ■ 





(29) 



from Eq. (f2"4|) and the fact that Cg 1 has the form 



'6 





... 0\ 









ci- 1 





J 



(30) 



with fi the bottom-right block of the matrix in the rhs of 
Eq. (f2"9")l . one obtains that 



©c 



©c 



(31) 



Here, no use is made of the operator E[.|.] since we are 
dealing with fixed realizations of the random processes 
that generate the CMB as well the Galactic components. 
Therefore, if matrix C s is known, the ILC solution is ex- 
act. 

This condition changes if matrix C g has to be esti- 
mated through Eq. (JT3J) and a general treatment of this 
problem is quite difficult. For this reason, we consider the 
case of observations that span a sky area much wider than 



the correlation lengths of the maps {5 ,< - l - ) }. Under this con 



dition, Ce 



ith 



/N p can be written in the form 



C& — C 



e 



AC 



e 



(32) 



/ §c 



V ScN„ 



Scl S C 2 



->cN a 



5ll $12 

SlN ^2N 



Sin, 
Sn n ) 



(33) 



a perturbing matrix with small zero-mean entries. Because 
of this, it can be expanded in Taylor series around C 6 ' 
up to the linear term obtaining 



or 



(34) 



' e 



— 1 


- °6 


Cg'ACgCg 1 , 


( ^ - 










n 1 n x ©o 1 



(35) 



with 6 = (5 c i,S C 2, . . . ,5 c n o ) t and the bottom-right 
block of the matrix in the rhs of Eq. (|33[) 0. From this 
result and from Eqs. (|24"]l. (|2T|) and ([25]). one obtains that 



©c 



(l,oe)©, 



or 



A© c = © c -© c = (0,^e)©, 
with (5e a row vector given by 



'6 



Tn-l 



-S'Cl 



Tn-l 



(36) 
(37) 

(38) 



Hence, not unexpectedly, the ILC solution differs from the 
true one by an amount that depends on the sample corre- 
lation between © c and the Galactic templates. 

3.2. Case N Q <N C + 1 

If N < N c + 1 , the number of the components (CMB 
included) is larger than the number of channels. Since in 
this case A is a rectangular N Q x (N c + 1) matrix, the 
inverse A~ 1 is not defined. Hence, the ILC solution © c as 
given by Eq. pip cannot be written in the form (|24|) . The 
only possibility for still having © c = © c is that 



al T {AC & 1 A T )- 1 A^el. 



(39) 



It is necessary to stress that the entries of ACe are not 
independent. The point is that if C is a symmetric, positive def- 
inite matrix, then the same holds for its inverse C -1 . However, 
in general, this is not true for a matrix C — C + AC obtained 
perturbing C with an arbitrary matrix. This means that the 
entries of AC have to satisfy certain conditions. If C is written 
in the form [(Q + AQ)(Q + AQ) T ], with Q = C 1/2 and AQ 
a matrix of zero-mean random quantities, and then expanded 
expanded in Taylor series, one can find the same result as in 
Eq. (3D with AC = (AQ)Q T + Q(AQ) T . 
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In the present case, there is no reason to expect that this 
condition be satisfied. As a consequence, the ILC solution 
differs from the true one by an amount given by 



A@ c = [al T {AC e A T )- 1 A - ef]6. 



(40) 



This result implies that, similarly to other techniques 
(e.g., Generalized Least Squares), it is risky to use the 
ILC technique in situations where the number of observ- 
ing channels is smaller than the number of components. 
Unfortunately, the importance of A© c depends on the 
specific characteristics of both A and C& and, hence, a 
general treatment is not possible. However, it is not dif- 
ficult to realize that when the Galactic component is the 
dominant one, it can even happen that A© c > © c . For 
example, in the hypothesis that Ce = c 2 7 and 



A = 



1.0 2.0 3.0 
1.0 2.5 3.5 



Eq. (gOj) provides 

A© c = (-0.33, -0.33,0.33)©. 



(41) 



(42) 



This example does not illustrate a situation excessively un- 
favorable for the separation. In fact, close to the Galactic 
plane the CMB component is expected to be largely dom- 
inated by the Galactic emissions. 

3.3. Case N Q > N c + 1 

If N > iV c +l, the number of components (CMB included) 
is smaller than the number of channels. As in the previous 
section, also in this case A is N Q x (N c + 1) rectangular 
matrix but now we face a more difficult situation. In fact, 
the ILC solution © c cannot be written in the form (fTTj) 
since matrix Cs is singular. The only way out is to resort 
to the pseudo-inverse Cg 



& s = VDW 1 



(43) 



Here, V is the orthogonal matrix obtained from the sin- 
gular value decomposition of Cs, 



C s = VDV 1 , 



(44) 



is a diagonal matrix whose entries are given by = 
gQ 1 if da 7^ 0, otherwise, and {da} are the elements of 
the diagonal matrix D. In this case, the ILC solution is 
given by 

(45) 



© c = al T ClS, 



where 



[i T cU]-K 



(46) 



If A is a full column-rank matrix, then Eq. (|45[) can be 
rewritten in the form 



where 



© c = al T (AtfCe 1 ©, 



a=[l T {A^C- e 1 A^l}-\ 



(47) 
(48) 



and 



and 



At = (A T A)- 1 A T , 
(AY = A{A t A)-\ 



(49) 



(50) 

Now, since it is trivially verified that 1 T = e^A T , one 
obtains that 



l 1 {A 



(51) 



and then, from Eq. (|4"B")) . © c — © c . In other words, if 
matrix Cs is known, also in this case the ILC method 
provides an exact separation. When Cs has to be used, 
results similar to those presented in Sect. l3.1l are obtained. 

3.4. Comparison with the Fast lea implementation of 
the Independent Component Analysis (ICA) 

One conclusion that can be drawn from the arguments 
presented in the previous sections is that ILC suffer sev- 
eral drawbacks. However, this techniques is not the only 
one available for the blind separation of CMB from the 
Galactic foregrounds. One of the most celebrated com- 
petitor is the Fastlca implementation of the independent 
component analysis (ICA). This method works explicitly 
under model (fll?)) with N a = N c + 1, i.e. matrix A is 
square, and noiseless data. The most interesting charac- 
teristic of ICA is that, unlike ILC, this technique is able 
to provide an estimate not only of © c but also of all the 
Galactic components {©j}- For this reason, a comparison 
of the two methods is of interest. 

The basic idea behind ICA is rather simple (and obvi- 
ous) : to obtain the separation of the components it is suffi- 
cient to have an estimate of matrix A. In fact, © = A~ 1 S. 
Now, if the CMB component and the Galactic ones are 
mutually uncorrelated (i.e. if E[©© T ] = I), then 



SS T /N P = A A 1 



(52) 



This system of equations define A at least of a orthogonal 
matrix. In fact, if A = ZV, with V orthogonal, then 
SS T = AA T = ZVV T Z T = ZZ T . The problem is that, 
given the symmetry of SS J , system (|52p contains only 
N (N + l)/2 independent equations, but the estimates of 
N% quantities should be necessary. In ICA, the N (N — 
l)/2 missing equations are obtained by imposing that the 
components © are not only mutually uncorrelated but 
also mutually independent. In other words, the separation 
problem is converted into the form 



© = argmini r '(©), 
e 



(53) 



subject to & = A- 1 S and SS T /N p = AA T ', (54) 

with F(&) a function that measures the degree of in- 
dependence between the components ©. The definition 
of a reliable measure F(.) is not a tri vial task. In litera- 



ture, various choices are available (see iHvvarinnen et al. 



20011 and reference therein). In practical algorithms, the 



optimization problem is not implemented explicitly in 



(i 
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the form ([5^1) - ([5^|) . Typically, a first estimate ©» is ob- 
tained through a principal component analysis (PCA) step 
followed by a sphering operation (i.e. normalization to 
unit variance). In this way, a set of uncorrelated and 

normalized components are obtained, i.e. ©*©„ = I. 
Later, this estimate is iteratively refined to maximize 
F(&) and get the final ©. Again, in literature, various 
techniques are availabl e. FastICA exploits a fixed-point 



optimization approach flHyvarinnen et al. 1 120011 : iMaino 



2002; B accigalupi et al. T 2004 ) . An alternative technique 



is JA DE that makes use of the joint diagonalizing algo- 



rithm (|Cardoso Ill999h 



As stated above, the main advantage of ICA is its capa- 
bility to separate all the components of a given mixture, 
a property not shared by ILC. Its main limit is the re- 
quirement of mutually independence of the components. 
This is a much stronger requirement than the uncorre- 
latedness of the the CMB templates from the Galactic 
ones that is required by ILC. In particular, the various 
Galactic components are expected to be somewhat corre- 
lated. Another limit of some popular implementations of 
ICA is that the number N a has to be equal to N c + 1. 
IfJV e _<jy g _+l then no solution is possible (however, see 



Hvvarinnen et al. ifeoOlf ). If N > N c + 1, then the number 
N c has to be known in advance and a dimension reduc- 
tion operated. In the case of noise-free observations, N c 
can be obtained from the number of non-zero eigenvalues 
of matrix C s that are calculated during the PCA step. 
Using PCA it is also possible to operate the above men- 
tioned dimension reduction. If noise is present, however, 
the identification of the non-zero eigenvalues and hence 
the determination of N c may become difficult. This kind 
of problem is more serious than for ILC which does not 
require the exact value of N c but only that N a > N c + 1 . 

Since problem (|55)) - ([M|) is a nonlinear one, a more de- 
tailed statistical characterization of the ICA properties 
represents a quite difficult task. In any case, from the con- 
siderations above, one has not to expect that ICA could 
outperform ILC. On the other hand, given its important 
limits, ILC cannot be expected to offer much better per- 
formances. 

3.5. Some Numerical Experiments 

To support the results numerical experiments are pre- 
sented here. In these simulations we have deliberately cho- 
sen non-astronomical "deterministic" subjects. The reason 
is that in this way a direct visualization of the separation 
provided by ILC and ICA is possible as well as a sim- 
pler modeling of particular experimental conditions (e.g., 
the spurious correlation between different images due to 
their finite sizes). The fact that in the previous sections 
the CMB as well as the Galactic components are assumed 
to be the realization of stationary random processes, does 
not represent a limit since the arguments have been devel- 
oped assuming fixed realizations of those processes. This 
permits to interpret the separation as an operation in a de- 



terministic framework. The only difference is represented 
by the fact that, for genuine deterministic signals, the var- 
ious statistical quantities and operators lose their statis- 
tical meaning. For example, the cross-covariance matrix 
Cs coincides with Cs and becomes a measure of the co- 
herence between signals. Something similar holds for the 
other operators. 

In the first experiment, whose results are shown in 
Figs. QJ21 the performance of both ILC and ICA is tested 
under favorable conditions. In particular, three mixtures 
(Fig. [2]) have been simulated through model (Tl9j) using 
an equal number of component images (Fig. [1]) that are 
almost uncorrelated with each other. The mixing matrix 
A (|2"Tj) is given in Fig. [TJ The quality of the separation in 
Fig. [3] is excellent. This result can be ascribed to the small 
values of the off-diagonal entries of the cross-covariance 
matrix as supported by Figs.[4][6]that present the results of 
an experiment similar to the previous one. Here, however, 
a spurious dependence between the components of the 
mixtures is mimicked using three images having almost- 
constant luminosity areas in correspondence to the same 
coordinates (Fig. [4|) . Such a spurious dependence is high- 
lighted by the (normalized) cross-covariance matrix Ce 



1.00 0.20 0.06 
0.20 1.00 0.06 
0.06 0.06 1.00 



(55) 



The off-diagonal entries are much larger than zero and 
the separation provided Fig. [6] by both the methods is 
undoubtedly worse. 

The aim of the last experiment is to test the result 
of Sec. 13.21 that ILC is not able to provide reliable results 
when the number of observed mixtures is smaller than the 
number of the corresponding components. In this respect, 
Fig. [7] offers an illuminating example. This experiment is 
again similar to the one corresponding to Figs. [IKl with 
the difference that only two mixtures are available with 
the mixing matrix 



A = 



1.0 2.0 1.0 
1.0 1.0 2.0 



(56) 



The different quality of the separation does not deserve 
comment. Indeed, from Eq. (|40|) it is 

(57) 



A© c = (0,1.19,1.87)©, 



i.e. a quantity whose magnitude is comparable to the im- 
age of interest (see also the bottom- right panel in Fig. [7|) . 
Not unexpectedly, ICA as well seems to be unable to pro- 
vide an acceptable separation. 

4. STATISTICAL CHARACTERISTICS OF ILC: 
NOISY OBSERVATIONS 

In situations of maps contaminated by measurement 
noise A/", assumed zero-mean, additive, and stationary, 
model (Tl9l) becomes 



S = A&+M. 



(58) 
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In turn, under the condition of Af uncorrelated with S, 
Cg becomes 



AC&A 



with 



Sl A f = E[J\Tj\f T ]. 



(59) 



(60) 



If noise is small and N a = N c + 1, when Eq. (|59[) is in- 
serted in Eq. (|lip and the resulting expression expanded 
in Taylor series up to the linear term, it is possible to see 
that 

6c = a*[l r - 7 ]l- T Cg 1 6, (61) 

with 7 = (Tcc e T and a* a constant coefficient 
that depends on ilj^f and converges to a for decreasing 
level of the noise. A similar result holds when iV > N c + 1 
with (j4^) t substituting A~ T . From Eq. (|6"Tj) it is clear 
that E[© C |A@] ^ <3 C , i.e. the ILC estimator is biased. 
Since 7 depends on the specific characteristics of A and 
Sl/s, this does not allow a general treatment of the ques- 
tion. However, it is not difficult to realize that the bias 
affecting & c can be severe (e.g., see the top-left panel of 
Fig. [HI). 

It is possible to obtain an unbiased ILC estimator com- 
puting the weights w by means of Eq. (| 1 1 1) with the cross- 
covariance matrix Cs substituted by 



C* s — Cs — 



(62) 



This operation, however, has a cost: the weights w do not 
minimize any longer the variance of {G c (p)}. As a con- 
sequence, the influence of noise can be even dramatically 
amplified. This fact is clearly visible in the top panels of 
Figs. [U that compare the results obtained by the ILC es- 
timator based on Cs and C s = C s — respectively. 
Two mixtures have been simulated using the two figures 
in the top panels of Fig. [1] and the mixing matrix 



1.00 1.00 
1.00 0.95 



(63) 



A Gaussian, white-noise process has been added to these 
mixtures with a signal-to-noise-ratio (SNR) equal to 20. 
The reason of such unpleasant result has to be searched 
in the condition number, k(C* s ), of C* s that tends to be 
larger than k(Cs)- Because of this, the entries of (C^) -1 
can take values in a range much wider than those of Cg 1 
and the same holds for the corresponding weigh ts w. In 



fact, according to a theorem due to Weyl (e.g., see iBhatia 
1997j,Theorem III.2.2), it is 



Ai (C s ) - \i{Vt^) < Ai(C£) < Ai(C s ) - AAr o (rw), 

(64) 

XnACs) ~ Ai(fiAf) < Aat o (C^) < X No (Cs) - XnA^at), 

(65) 



4 Here, SNR is defined as the ratio of the standard deviation 
of the signal with respect to the standard deviation of the 
corresponding noise. 



where A,; (H) denotes the i-th eigenvalues of a N x N sym- 
metric, positive definite matrix H with Xi > X 2 . . . > 
A 7v > 0. These inequalities indicate an high probability 
that X No (C s )/X No (C s ) < A 1 (C|)/A 1 (C S ). In this way, 



«(C* ) = ^ 1<yC * s ^ > A i(Cs) _ K r Cs \ 
Ajv (Cg) Xn„(Cs) 



(66) 



For example, this is strictly true if J7jv" is a diagonal matrix 
with non-zero entries equal to a constant value g 7 since 



XnA^s) - Q 



(67) 



From these consideration, in the case of low SNR and large 
«;(Cg), one has to expect important amplification of the 
noise for the unbiased ILC estimator. 

The correctness of this argument is supported by 
Figs. |9"1TT2"1 that regard an experiment similar to that cor- 
responding to Fig. [5] These figures show the root-mean- 
square (rms) of the residuals corresponding to the biased, 
respectively, unbiased ILC estimators for SNR = 5, 100 
and various values of k. Different values of this last pa- 
rameter has been obtained assuming a mixing matrix with 
the form 

. /LOO 1.00 \ f . 

= { 1.00 a 21 ) > ^ 

and making 021 to assume a set of evenly spaced values 
in the range [0,3]. When 021 = 1, matrix A becomes sin- 
gular. An interesting indication that come out from these 
figures is that, in the case of high SNR the unbiased ILC 
estimator provides remarkably worse result only when ma- 
trix C* s is very ill-conditioned. This may happen if the 
different images are very similar (observations performed 
at very similar frequencies). This suggests that, if n(C* s ) 
is not too large, the unbiased ILC estimator could be used 
with the benefit that the filtering a noise is typically an 
easier operation than the removal of a bias. In any case, 
even in the case C* s be ill-conditioned, as it happens when 
N Q > N c + 1, its k can be reduced by means of the SVD 
technique presented in Sec. 13.31 

It is necessary to stress that all of these conclusions 
hold only in situations of isotropic noise (i.e., a noise with 
identical characteristics everywhere). In the contrary case, 
bad results have to be expected since noise behaves as a 
sort of additional channel-dependent component. 

5. CONCLUSIONS 

The arguments presented in the previous sections show 
that for a safe use of the ILC estimator some conditions 
have to be satisfied. In particular: 

1. The observations have to cover a sky area much wider 
than the spatial scale (correlation length) of the ob- 
served maps in such a way to allow an accurate esti- 
mate of the cross-covariance matrix Cs; 

2. Model ([Tg ]) -([2"T ]1 has to hold. In particular, ILC can- 
not be expected to produce satisfactory results if 
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some of the Galactic templates depend on the ob- 
serving channels. This point can be realized if a 
Galactic channel-dependent template &j is thought 
as the sum of Nj channel- independent templates. In 
this case, model (Tl9|) - (l2T1) still holds but with an ef- 
fective number of Galactic components that now is 
N* — Ylj=i Nj- Since the number of observing chan- 
nel is typically rather limited, in practical applications 
it has to be expected that N a < N* + 1 and then the 
point below applies; 

3. The condition N Q > N* + l has to hold, i.e. the number 
of the observing channels has to be equal or larger 
than the number of the physical components (CMB 
included) that contribute to form the observed maps. 
In the contrary case, the solution can suffer a severe 
distortion; 

4. The level of the noise has to be rather low otherwise 
a severe bias can be introduced in the ILC estimator. 
This can be removed but at the price of a possible re- 
markable amplification of the noise influence. However, 
especially in situations of high SNR, the use of the un- 
biased ILC estimator can offer some advantages. 

From these points it is not difficult to realize that the ILC 
estimator is not trivial to use. In particular, the first two 
points conflict each other. In fact, in the case of wide maps, 
model (fl9l) - ([2T|) is not applicable since on large spatial 
scales the Galactic templates are expected to depend on 
the observing channel. In this respect, a simple tests can 
be of help that is based on the analysis of the eigenvalues 
of Cs- In fact, if N a > N* + 1, matrix Cs has to be 
(almost) singular with a number of (almost) zero elements 
in the diagonal matrix D of Eq. (@H) equal to N a — N* — 
1. Unfortunately, this test can be thwarted by the noise; 
because of the statistical fluctuations, the entries of D 
that should be close to zero can take larger value. 

The obvious conclusion is that, in order to obtain reli- 
able results, the use of ILC requires a careful planning of 
the observations. The area of the sky to observe as well 
as the tolerable level of the noise are factors that have to 
be fixed in advance. To try an "a posteriori' correction 
of the distortions introduced in the ILC solution by the 
violation of the above conditions is a quite risky opera- 
tion. The question is that all of these distortions critically 
depend on the true solution that one tries to estimate and 
thereof they cannot be obtained from the data only. The 
alternative represented by the numerical simulations that 
make use of semi-empirical templates is quite risky. In par- 
ticular, there is the concrete possibility to force spurious 
features in the final results. This holds also in the case 
these results are used as "prio^' in more sophisticated 
separation techniques. 

A last comment regards the use ILC in the Fourier 
domain. Since this method provides the same result inde- 
pendently of the domain, it could seem that there is no 
particular benefit to work in the Fourier one. Actually, 
this can be not true if the maps have different spatial res- 
olutions (i.e. the observing channels have different point 



spread functions) and /or whether a frequen cy-dependent 
separation is desired ( Tegmark et al. 2003l h Although in 
this way it is possible to improve some properties of the 
separated maps (e.g., the spatial resolution), it is neces- 
sary to stress that this has a cost in the amplification of 
the noise level. 

Acknowledgements. We thank Dr. Carlo Baccigalupi for care- 
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Original N.1 Original N.2 
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50 100 150 200 250 



Fig. 1. Original images © - see Eq. (IT9|) - used in the first experiment described in the text (i.e., separation of almost 
uncorrelated components) to test both ILC and ICA (FastICA algorithm) techniques. The top-left panel shows the 
image to recover. All the images, before the mean subtraction, have values in the interval [0,1]. The bottom-right 
panel provides the mixing matrix A - see Eq. (|2ip . 
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Observed N.1 Observed N.2 




Observed N.3 Reconstructed Image (ILC) 




Fig. 2. The three mixtures (observed images) S = A& used in the first experiment to test the ILC and ICA (FastICA 
algorithm). Here, the experiment mimics three independent observing channels and the separation in the case of three 
almost uncorrelated components. The bottom-right panel displays the resulting "restored" image obtained with ILC. 
As seen the ILC method retrieves the input image quite well. 
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Reconstructed Image (ICA) Reconstructed Image (ICA) 




Reconstructed Image (ICA) Reconstructed Image (ILC) 




Fig. 3. Comparison between the results obtained with ICA (FastICA algorithm) and ILC concerning the mixtures in 
Fig. [21 The first three panels show the three components reconstructed by the ICA algorithm. The bottom right panel 
shows the result obtained with ILC, already shown in Fig. [2] 
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Fig. 4. Original images & used in the second experiment described in the text (i.e., separation of partially correlated 
components) to test the ILC and ICA (FastICA algorithm) techniques. The top-left panel shows the image to recover. 
Here a spurious correlations among the images is introduced (as often the case in foreground components of the 
CMB). These images before the mean subtraction, have values in the interval [0, 1]. The bottom-right panel provides 
the mixing matrix A. 
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Observed N.1 Observed N.2 




Observed N.3 Reconstructed Image (ILC) 




Fig. 5. The three mixtures (observed images) S = A& used in the second experiment to test the ILC and ICA 
(FastICA algorithm). Here, the experiment mimics three independent observing channels and the separation in the 
case of three almost uncorrelated components. The bottom-right panel displays the resulting "restored" image obtained 
with ILC. As seen the method is not able to fully retrieve the input image. 
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Fig. 6. Comparison between the results obtained with ICA (Fastlca algorithm) and ILC concerning the mixtures in 
Fig. El The first three panels show the three components reconstructed by the ICA algorithm. The bottom right panel 
shows the result obtained with ILC, already shown in Fig. [5] This figure has to be compared with Fig. [3] 
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Reconstructed Image N.1 (ICA) Reconstructed Image N.2 (ICA) 




Reconstructed Image (ILC) ILC Solution - True Solution 




50 100 150 200 250 50 100 150 200 250 



Fig. 7. Resulting images as in the experiment shown in Figs.[T][3] In this case the number of components affecting the 
signal is larger than the number of mixtures (observed images). The case shown in this figure corresponds to three 
original input images and only two mixtures available. The bottom-right panel shows the residuals, A(5 C = & c — & c , 
in the ILC method. These residuals are of the same order of magnitudes as the input images and therefore the method 
is not able to retrieve the original components. This figure has to be compared with Fig. [3J 
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Biased ILC (noisy mixture) Unbiased ILC (noisy mixture) 




Biased ILC (noise-free mixture) Unbiased ILC (noise-free mixture) 




Fig. 8. The figure shows the results of noise effects for the biased, respectively, unbiased ILC estimator as described 
in the text (see Sec. |4j: Top panels - Comparison between the results obtained with the two estimators when to 
the mixtures a Gaussian, zero-mean, white-noise process with SNR = 20 is added. The presence of the bias for the 
biased estimator and the amplification of the noise for the unbiased one are evident. Lower panels - Results obtained 
when the same estimators are applied to the noise- free version of the mixtures. This is to better characterize the bias 
introduced by the biased estimator and the unbiasedness of the other one. 
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SNH = 5 SNR = 100 
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Fig. 9. Experiment showing the effect of noise: RMS of 
the residuals for the biased, respectively, unbiased ILC es- 
timators vs. a2i when SNR = 5. Here, 021 = 1 corresponds 
to a singular mixing matrix A. The dip in correspondence 
to this value is a numerical effect do to the use of the 
sample C s . 



SNR = 5 




0.5 1 1.5 2 2.5 3 



Si 

Fig. 10. Experiment concerning the influence of the noise: 
condition number k of Cg, respectively, C s vs. 021 when 
SNR = 5. Here, 021 = 1 corresponds to a singular mixing 
matrix A. 



Fig. 11. Experiment showing the effect of noise: RMS of 
the residuals for the biased, respectively, unbiased ILC 
estimators vs. 021 when SNR = 100. Here, 021 = 1 corre- 
sponds to a singular mixing matrix A. 



SNR = 100 




Si 

Fig. 12. Experiment showing the effect of noise: condition 
number k of Cs, respectively, C s vs. 021 when SNR = 
100. Here, a- 2 \ — 1 corresponds to a singular mixing matrix 
A. 



